Rank | Count | Beginning |
---|---|---|
8335 | 829 | Цилӓжӹ |
2956 | 553 | Кытшы |
6520 | 265 | Ти |
7645 | 201 | Хала |
7837 | 169 | Халан |
8227 | 108 | Цилӓжӓт |
3804 | 101 | Лӹмжӹм |
2742 | 92 | Кымдецшӹ |
7117 | 92 | Тӹдӹ |
6571 | 80 | Тидӹ |
2870 | 78 | Кырык |
6415 | 71 | Территорижӹ |
6288 | 69 | Тенге |
7084 | 69 | Тӹ |
5927 | 64 | Сола |
4522 | 63 | Но |
1063 | 52 | Вес |
2229 | 46 | Кого |
2651 | 46 | Кӹзӹт |
5749 | 46 | Сек |
9679 | 45 | Ӹлӹзӹ |
7194 | 44 | Тӹдӹн |
9492 | 43 | Штатын |
6016 | 38 | Солашты |
3919 | 37 | Лӹмӹн |
4628 | 36 | Нӹнӹ |
5321 | 35 | Пӹтӓриш |
6338 | 35 | Тенгеок |
4090 | 34 | Мары |
4451 | 31 | Нелӹцшӹ |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N = 1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
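For readers without access to the database, the counting step of the query above can be approximated in Python. This is only a minimal sketch: the function name `sentence_beginnings`, its parameters, and the sample sentences are hypothetical and not part of the corpus tooling. It assumes sentences are available as plain strings and that words are separated by single spaces, as `substring_index(sentence, ' ', 1)` does.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent sentence beginnings of `n` words.

    Hypothetical helper: mirrors substring_index(sentence, ' ', n) by
    keeping everything before the n-th space of each sentence.
    """
    counts = Counter()
    for sentence in sentences:
        # Split on single spaces and rejoin the first n pieces.
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    # Sorted by descending count, like "order by cnt desc limit 50".
    return counts.most_common(limit)


if __name__ == "__main__":
    # Made-up sample sentences, for illustration only.
    sample = [
        "The town is large.",
        "The village is small.",
        "It lies on a hill.",
    ]
    for beginning, count in sentence_beginnings(sample):
        print(count, beginning)
```

Setting `n` to 2, 3, or 4 would give the longer beginnings covered in the following subsections, under the same single-space assumption.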
4.3.1.2 Most Frequent Sentence Beginnings II
4.3.1.3 Most Frequent Sentence Beginnings III
4.3.1.4 Most Frequent Sentence Beginnings IV
4.3.1.1 Most Frequent Sentence Endings I
4.3.1.2 Most Frequent Sentence Endings II
4.3.1.3 Most Frequent Sentence Endings III
4.3.1.4 Most Frequent Sentence Endings IV